The Golden Estimator: Efficient Range Query
Authors
Abstract
Query size estimation is crucial for many database system components. In particular, query optimizers need efficient and accurate query size estimation when deciding among alternative query plans. In this paper we propose the Golden Estimator, which is based on the so-called golden rule of sampling proposed by von Neumann, for estimating the size of single-dimensional range queries. The Golden Estimator randomly samples the frequency domain using the cumulative frequency distribution. We argue that this approach yields good estimates irrespective of the actual underlying distribution of values. We then show experimentally that the Golden Estimator gives better approximations than state-of-the-art histogram-based and wavelet-based approaches under the same space requirement.
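The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it names: drawing uniform random numbers and inverting the cumulative frequency distribution (inverse-transform sampling), then scaling the fraction of sampled values that fall in a range. The function names `build_sample` and `estimate_range_count`, the sample size `k`, and the closed-interval query form `[lo, hi]` are assumptions made for illustration and are not taken from the paper.

```python
import bisect
import random


def build_sample(values, k, seed=0):
    """Draw k values by inverting the empirical cumulative distribution.

    Hypothetical helper: the paper's actual construction may differ.
    """
    rng = random.Random(seed)
    ordered = sorted(values)
    n = len(ordered)
    sample = []
    for _ in range(k):
        u = rng.random()              # uniform draw in [0, 1)
        idx = min(int(u * n), n - 1)  # position of u in the cumulative frequency
        sample.append(ordered[idx])
    sample.sort()                     # sorted so range lookups can use bisect
    return sample, n


def estimate_range_count(sample, n, lo, hi):
    """Estimate how many of the n original values lie in [lo, hi]."""
    left = bisect.bisect_left(sample, lo)
    right = bisect.bisect_right(sample, hi)
    # Fraction of sampled values inside the range, scaled to the full size.
    return n * (right - left) / len(sample)
```

Because the draws follow the cumulative frequencies, frequent values are sampled more often, which is one plausible reading of why such an estimator would not depend on the shape of the underlying distribution. Only the `k` sampled values and the total count `n` need to be stored.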
Similar resources
Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models
Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer. The accuracy of said estimates has a strong and direct impact on the quality of the generated plans, and incorrect estimates can have a negative impact on query performance. One of the biggest challenges in this field is to predict the result size of join operatio...
A new framework for addressing temporal range queries and some preliminary results
Given a set of n objects, each characterized by d attributes specified at m fixed time instances, we are interested in the problem of designing space-efficient indexing structures such that arbitrary temporal range search queries can be handled efficiently. When m = 1, our problem reduces to the d-dimensional orthogonal search problem. We establish efficient data structures to handle several classes of the gener...
Efficient Regressions via Optimally Combining Quantile Information
We develop a generally applicable framework for constructing efficient estimators of regression models via quantile regressions. The proposed method is based on optimally combining information over multiple quantiles and can be applied to a broad range of parametric and nonparametric settings. When combining information over a fixed number of quantiles, we derive an explicit upper bound on the di...
Space-Efficient Data Cubes for Dynamic Environments
Data cubes provide aggregate information to support the analysis of the contents of data warehouses and databases. An important tool to analyze data in data cubes is the range query. For range queries that summarize large regions of massive data cubes, computing the query result on-the-fly can result in non-interactive response times (e.g., in the order of minutes). To speed up range queries, valu...
Querying with Xcerpt: Theory, Complexity, and Algorithms
Applications and services that access Web data are becoming increasingly useful and widespread. Web query languages provide efficient and effective means to access and process data published on the Web. Xcerpt is a particular breed of Web query languages tailored to versatile data access for RDF, XML, and other Web representation formats. Its rich constructs and reasoning capabilities make i...